Automatic language identification using large vocabulary continuous speech recognition
Abstract
We have developed a highly accurate automatic language identification system based on large vocabulary continuous speech recognition (LVCSR). Each test utterance is recognized in a number of languages, and the language ID decision is based on the probability of the output word sequence reported by each recognizer. Recognizers were implemented for this test in English, Japanese, and Spanish, using the Ricardo corpus of telephone monologues. When tested on the OGI corpus of digitally recorded telephone speech, we obtained error rates of 3% or lower on 2-way and 3-way closed-set classification of ten-second and one-minute speech
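The decision rule described in the abstract (run one recognizer per language, then compare the probabilities of their output word sequences) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of log-probabilities, the equal-prior softmax, and the score values are all assumptions made for illustration. The abstract does not say how the recognizer scores are normalized or compared.

```python
import math
from typing import Dict


def identify_language(log_probs: Dict[str, float]) -> str:
    """Pick the language whose recognizer reported the highest score.

    `log_probs` maps a language name to the log-probability that the
    recognizer for that language assigned to its own output word sequence
    (hypothetical input format, assumed for this sketch).
    """
    if not log_probs:
        raise ValueError("need at least one recognizer score")
    return max(log_probs, key=log_probs.get)


def posteriors(log_probs: Dict[str, float]) -> Dict[str, float]:
    """Turn the scores into closed-set posteriors, assuming equal priors."""
    top = max(log_probs.values())
    # Subtract the maximum before exponentiating to avoid underflow.
    weights = {lang: math.exp(lp - top) for lang, lp in log_probs.items()}
    total = sum(weights.values())
    return {lang: w / total for lang, w in weights.items()}


# Made-up scores for one test utterance, for illustration only.
scores = {"English": -4210.5, "Japanese": -4388.2, "Spanish": -4302.9}
print(identify_language(scores))
```

A 2-way test would pass only two entries in `scores`. The closed-set assumption means the utterance is always assigned to one of the listed languages.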
Similar resources
Continuous Sign Language Recognition – Approaches from Speech Recognition and Available Data Resources
In this paper we describe our current work on automatic continuous sign language recognition. We present an automatic sign language recognition system that is based on a large vocabulary speech recognition system and adopts many of the approaches that are conventionally applied in the recognition of spoken language. Furthermore, we present a set of freely available databases that can be used fo...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Improving Continuous Sign Language Recognition: Speech Recognition Techniques and System Design
Automatic sign language recognition (ASLR) is a special case of automatic speech recognition (ASR) and computer vision (CV) and is currently evolving from using artificial lab-generated data to using ’real-life’ data. Although ASLR still struggles with feature extraction, it can benefit from techniques developed for ASR. We present a large-vocabulary ASLR system that is able to recognize sentenc...
Session 3: Continuous Speech Recognition
The papers in this session focus on techniques for and applications of large-vocabulary continuous speech recognition. The technique-oriented papers discuss techniques for channel compensation, fast search, acoustic modeling, and adaptive language modeling. The applications-oriented papers discuss methods for using recognizers for language identification, speaker identification, speaker sex iden...
Malay language modeling in large vocabulary continuous speech recognition with linguistic information
In this paper, our recent progress in developing and evaluating Malay Large Vocabulary Continuous Speech Recognizer (LVCSR) with considerations of linguistic information is discussed. The best baseline system has a WER of 15.8%. In order to propose methods to improve the accuracies further, additional experiments have been performed using linguistic information such as part-of-speech and stem. W...
Publication date: 1996